EXHIBIT C 



2004-06-03 16:55 TO-STAAS 



FROM-IKEUCHI-SATO & PARTNER PATENT ATTORNEYS T-903 P. 004/011 F-623 



input part 34 is converted into a feature parameter for speaker 
verification and sent out to the verification distance calculating part 36A 
and the speaker distance calculating part 36B. The verification distance 
calculating part 36A calculates the distance d_id between the voice
template of the speaker corresponding to the identity claim and the
feature parameter of the input voice. 

On the other hand, the speaker distance calculating part 36B
calculates the distances d_1, d_2, ... and d_N between the voice templates
of N other registered speakers and the feature parameter of the input
voice and delivers the results to the distribution estimating part 37.
The distribution estimating part 37 estimates a probability distribution
function F(d) of the speaker distances between the voices of the registered
speakers other than the speaker corresponding to the input identity claim
and the input voice, using the calculated N distances d_1, d_2, ... and d_N
with respect to the other registered speakers, and delivers the result to
the speaker judging part 39.

The estimation of the probability distribution function F(d) leads
to a probability density function f(d). The area under the probability
density function f(d) indicates a probability value. The relationship
between the probability distribution function F(d) and the probability
density function f(d) is as shown in Equation 1.
Equation 1 
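
The body of Equation 1 is not legible in this copy. For reference only, the standard relationship between a probability distribution function and its density, which the surrounding text appears to describe, can be written as below. The integration variable t and the lower limit of integration are assumptions introduced for illustration.

```latex
F(d) = \int_{-\infty}^{d} f(t)\,\mathrm{d}t
```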

Therefore, the speaker judging part 39 judges the speaker based
on the probability density function f(d) in the following manner. When
the speaker distance with respect to the speaker corresponding to the
identity claim is within the region defined by the level of significance p of
regarding an unauthorized person as the person specified by the ID,
which is previously designated in the false acceptance error rate input
part 38, it is determined that the speaker is the person specified by the ID.
When the distance d_id is not within the region, it is determined that the
speaker is not the person specified by the ID. In the determination
based on the probability distribution function F(d), when F(d_id) < p is
satisfied, the speaker is the person specified by the ID. When F(d_id) ≥ p
is satisfied, the speaker is not the person specified by the ID.
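
The decision rule above (accept when F(d_id) < p, reject otherwise) can be sketched as follows. This is a minimal illustration, not the estimator used by the distribution estimating part 37. It assumes F(d) is approximated by the empirical distribution of the N distances to the other registered speakers. The function names and the sample distances are hypothetical.

```python
from bisect import bisect_right

def empirical_cdf(distances, d):
    """Fraction of the other-speaker distances that are <= d (a simple estimate of F(d))."""
    ordered = sorted(distances)
    return bisect_right(ordered, d) / len(ordered)

def accept_claim(d_id, other_distances, p):
    """Accept the identity claim when F(d_id) < p; reject when F(d_id) >= p."""
    return empirical_cdf(other_distances, d_id) < p

# Hypothetical distances d_1 ... d_N to N = 10 other registered speakers.
others = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 6.4, 7.0, 7.5, 8.1]
print(accept_claim(3.5, others, 0.1))  # True: F(3.5) = 0.0 < 0.1
print(accept_claim(5.1, others, 0.1))  # False: F(5.1) = 0.3 >= 0.1
```

With an empirical estimate of this kind, the smallest usable p is limited by N, which is one reason a parametric estimate of F(d) may be preferred.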

Fig. 4 shows a diagram illustrating the method for judging the
speaker by the speaker judging part 39. In the case where the
probability density function f(d) is already obtained, the hatched region in
Fig. 4 corresponds to the region defined by the level of significance p
of regarding an unauthorized person as the person specified by the ID.
More specifically, the level of significance p of regarding an unauthorized
person as the person specified by the ID is specified to determine that the
speaker is the person specified by the ID when the distance d_id is in the
range in which the level of significance of regarding an unauthorized
person as the person specified by the ID is smaller than the designated
level of significance p.

Next, Fig. 5 is a block diagram of a speaker verification apparatus
of one example of the present invention when verifying the speaker.
Referring to Fig. 5, numerals 51A and 51B denote DP matching parts.
Numeral 52 denotes a statistic calculating part. Numeral 53 denotes a 
speaker judging part. Numeral 54 denotes a false acceptance error rate 
input part. 

In Fig. 5, similarly to Fig. 3, an identity claim is input to the ID
input part 31 at the time of using a system. Then, the speaker template
selecting part 32 selects a template corresponding to the identity claim
from templates of a plurality of speakers that are previously registered in
the speaker template storing part 33 and sends the selected template to
the DP matching part 51A. At the same time, the templates of the
registered speakers other than the speaker corresponding to the identity
claim are sent out to the DP matching part 51B. Herein, "DP" stands for
dynamic programming.

Next, in the voice analyzing part 35, a voice input to the voice
input part 34 is converted into a feature parameter for speaker
verification and sent out to the DP matching parts 51A and
51B. The DP matching part 51A calculates the distance d_id between the
voice template of the speaker corresponding to the identity claim and the
feature parameter of the input voice.
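
The distance produced by a DP matching part can be illustrated with a generic dynamic-programming alignment. The text does not specify the local distance, the path constraints, or any normalization. The sketch below therefore assumes a Euclidean local distance, the three basic step directions, and normalization by the sum of the sequence lengths. The function name is hypothetical.

```python
import math

def dp_matching_distance(template, features):
    """Dynamic-programming alignment cost between two sequences of feature vectors."""
    n, m = len(template), len(features)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning template[:i] with features[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(template[i - 1], features[j - 1])  # assumed Euclidean
            cost[i][j] = local + min(cost[i - 1][j],      # advance template only
                                     cost[i][j - 1],      # advance input only
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m] / (n + m)  # assumed length normalization

# The same contour spoken at a different rate aligns with zero cost.
template = [(0.0,), (1.0,), (2.0,)]
utterance = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dp_matching_distance(template, utterance))  # 0.0
```

The same function would be applied once to the claimed speaker's template to obtain d_id and once per other registered speaker to obtain the remaining distances.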

On the other hand, the DP matching part 51B calculates the
distances d_1, d_2, ... and d_N between the voice templates of N other
registered speakers and the feature parameter of the input voice, and
delivers the results to the statistic calculating part 52. The statistic
calculating part 52 estimates the average μ and the standard deviation σ
of the speaker distances, using the calculated N distances d_1, d_2, ... and
d_N with respect to the other registered speakers, and delivers the
estimations to the speaker judging part 53. The speaker judging part 53
defines a normal distribution using the average μ and the standard
deviation σ of the distances with respect to the other registered
speakers.

If the probability distribution is a normal distribution, a
probability distribution function F(d) in a point α·σ away from the
average μ can be determined by α. Therefore, whether or not the
speaker is the person specified by the ID can be determined by examining
whether or not the verification distance d_id is in a region where d_id is equal
to or smaller than (μ − α·σ), in order to determine whether or not the
verification distance d_id with respect to the input voice is within the region
defined by the previously designated level of significance p of regarding
an unauthorized person as the person specified by the ID. More
specifically, (μ − α·σ) and d_id are compared and the determination is
performed as follows. When d_id is equal to or smaller than (μ − α·σ), it
is determined that the speaker is the person specified by the ID. When
d_id is larger than (μ − α·σ), it is determined that the speaker is not the
person specified by the ID. In the case where it is assumed that the
probability distribution is a normal distribution, the false acceptance
error rate input part 54 inputs α corresponding to the level of
significance p of regarding an unauthorized person as the person specified
by the ID beforehand.
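
The comparison of d_id with (μ − α·σ) can be sketched as follows. The conversion from p to α uses the inverse of the standard normal distribution function, which is what a normal distribution table provides. The use of the sample standard deviation and all function names are assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def alpha_for_rate(p):
    """alpha such that a normal variable falls below (mu - alpha*sigma) with probability p."""
    return -NormalDist().inv_cdf(p)

def accept_claim_normal(d_id, other_distances, alpha):
    """Accept when d_id <= mu - alpha*sigma, with mu and sigma estimated from the others."""
    mu = mean(other_distances)
    sigma = stdev(other_distances)  # sample standard deviation (an assumption)
    return d_id <= mu - alpha * sigma

# Hypothetical distances to five other registered speakers: mu = 6.0, sigma ~ 1.58.
others = [4.0, 5.0, 6.0, 7.0, 8.0]
alpha = alpha_for_rate(0.05)  # about 1.645 for a 5 % false acceptance rate
print(accept_claim_normal(3.0, others, alpha))  # True: 3.0 is below the threshold (~3.40)
print(accept_claim_normal(4.0, others, alpha))  # False: 4.0 is above the threshold
```

Because only μ and σ must be estimated, this form stays usable when N is small, at the cost of relying on the normality assumption.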

In this embodiment, the feature parameters are registered in the
form of templates beforehand, and the probability distribution with
respect to other registered speakers is estimated based on the speaker
distances obtained by DP matching. The present invention is not limited
to this method. For example, the probability distribution can be
estimated based on a probability value output from a probability model
such as a Hidden Markov Model.

Furthermore, in the speaker template storing part 33, speakers
may be classified by gender beforehand. When the speaker
corresponding to the identity claim is male, the speaker templates of
other male speakers are used for estimation of the probability
distribution. When the speaker corresponding to the identity claim is
female, the speaker templates of other female speakers are used for
estimation of the probability distribution. Thus, the error rate of the
probability distribution becomes closer to the error rate obtained from the
normal distribution function table. (The identity claim is something
which indicates a specific individual, such as a name.)
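
The gender-based selection of the other speakers' templates can be sketched as a simple filter. The registry layout (a dictionary keyed by speaker ID with "gender" and "template" entries) and the function name are hypothetical. The text does not describe how the speaker template storing part 33 organizes its data.

```python
def same_gender_templates(registry, claimed_id):
    """Templates of the other registered speakers who share the claimant's gender."""
    gender = registry[claimed_id]["gender"]
    return {
        speaker_id: entry["template"]
        for speaker_id, entry in registry.items()
        if speaker_id != claimed_id and entry["gender"] == gender
    }

# Hypothetical registry with placeholder template values.
registry = {
    "sato":   {"gender": "male",   "template": "T1"},
    "suzuki": {"gender": "male",   "template": "T2"},
    "tanaka": {"gender": "female", "template": "T3"},
    "ito":    {"gender": "male",   "template": "T4"},
}
print(sorted(same_gender_templates(registry, "sato")))  # ['ito', 'suzuki']
```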

Furthermore, in this embodiment, the probability distribution of
the speaker distances is estimated as a single normal distribution.
However, the probability distribution can be estimated as a mixed normal
distribution defined by weighting addition of a plurality of normal
distributions or other general probability distributions. (This is not
necessarily limited to the distribution of other registered speakers, and
other speakers can be prepared for the calculation of the distribution.)
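
When the distribution is modeled as a weighted sum of normal distributions, F(d) becomes the same weighted sum of the component distribution functions, and the F(d_id) < p test applies unchanged. The sketch below assumes the component weights, means, and standard deviations are already known. How they are fitted (for example by an EM procedure) is not described in the text and is not shown. The function names and parameter values are hypothetical.

```python
from statistics import NormalDist

def mixture_cdf(d, components):
    """F(d) for a weighted sum of normals; components = [(weight, mean, std), ...]."""
    total_weight = sum(w for w, _, _ in components)
    weighted = sum(w * NormalDist(mu, sigma).cdf(d) for w, mu, sigma in components)
    return weighted / total_weight  # normalize in case weights do not sum to 1

def accept_claim_mixture(d_id, components, p):
    """Accept the identity claim when the mixture F(d_id) is below p."""
    return mixture_cdf(d_id, components) < p

# Hypothetical two-component model of the other-speaker distances.
components = [(0.5, 5.0, 1.0), (0.5, 9.0, 1.0)]
print(accept_claim_mixture(2.0, components, 0.01))  # True: F(2.0) is about 0.0007
print(accept_claim_mixture(7.0, components, 0.01))  # False: F(7.0) is 0.5
```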

Next, the effects of this embodiment are confirmed by the results
of the following experiments. First, Fig. 6 is a graph showing the results
of verification of 15 male speakers using the speaker verification method
of this embodiment.

In Fig. 6, the horizontal axis indicates α obtained from the

WHAT IS CLAIMED IS:

1. A speaker verification apparatus comprising:

an identity claim input part to which an identity claim is input;

a speaker selecting part for selecting voice information of a
registered speaker corresponding to the identity claim input to the
identity claim input part;

a speaker storing part for storing voice information of speakers;

a voice input part to which a voice is input;

a voice analyzing part for analyzing the voice input to the voice
input part;

a speaker distance calculating part for calculating a verification
distance between a feature parameter of the input voice and that of the
voice of the registered speaker and the speaker distances between a
feature parameter of the input voice and those of the voices of speakers
other than the registered speaker that are stored in the speaker storing
part, based on the analysis results of the voice analyzing part and the
voice information stored in the speaker storing part; and

a speaker judging part for determining whether or not the input
voice matches the registered speaker corresponding to the input identity
claim,

the speaker verification apparatus further comprising:

a false acceptance error rate input part to which a false
acceptance error rate is input as a threshold, the false acceptance error
rate being predetermined by a system manager or a user or adjustable
depending on performance; and

a distribution estimating part for obtaining a probability
distribution of interspeaker distances based on the speaker distances
calculated in the speaker distance calculating part;

wherein the speaker judging part determines that the input voice
is the voice of the registered person specified by the identity claim, in the
case where the verification distance calculated in the speaker distance
calculating part is included in a region defined by the input false
acceptance error rate in the probability distribution of the interspeaker
distances.

2. The speaker verification apparatus according to claim 1,

wherein it is assumed that the probability distribution of the
speaker distances is a normal distribution function, and

the speaker judging part determines that the input voice is the
voice of the registered person specified by the identity claim, in the case
where the verification distance calculated in the speaker distance
calculating part is included in a region defined by the input false
acceptance error rate in the probability distribution of the speaker
distances obtained from the normal distribution function.

3. The speaker verification apparatus according to claim 1,

wherein the probability distribution of the speaker distances is
obtained for each gender.

4. The speaker verification apparatus according to claim 1,

wherein the probability distribution of the speaker distances is
obtained as a weighting addition of a plurality of normal distributions.

5. A method for verifying a speaker comprising:

inputting an identity claim;

selecting voice information of a registered speaker corresponding
to the input identity claim;

inputting a voice of the speaker;

analyzing the input voice;

calculating a verification distance between a feature parameter of
the input voice and that of the voice of the registered speaker and the
speaker distances between a feature parameter of the input voice and
those of voices of speakers other than the registered speaker, based on
the analysis results and the voice; and

determining whether or not the input voice matches the
registered speaker corresponding to the input identity claim,

the method further comprising:

inputting a false acceptance error rate as a threshold, the false
acceptance error being predetermined by a system manager or a user or
adjustable depending on performance; and

obtaining a probability distribution of the interspeaker distances
based on the calculated speaker distances;

wherein it is determined that the input voice is the voice of the
registered person specified by the identity claim, in the case where the
calculated verification distance is included in a region defined by the
input false acceptance error rate in the probability distribution of the
interspeaker distances.

6. A computer-readable recording medium storing a program to be
executed by a computer, the program comprising:

inputting an identity claim;

selecting voice information of a registered speaker corresponding
to the input identity claim;

inputting a voice of the speaker;

analyzing the input voice;

calculating a verification distance between a feature parameter of
the input voice and that of the voice of the registered speaker and the
speaker distances between a feature parameter of the input voice and
those of voices of speakers other than the registered speaker, based on
the analysis results and the voice; and

determining whether or not the input voice matches the
registered speaker corresponding to the input identity claim,

the program further comprising:

inputting a false acceptance error rate as a threshold, the false
acceptance error rate being predetermined by a system manager or a
user or adjustable depending on performance; and

obtaining a probability distribution of the interspeaker distances
based on the calculated speaker distances;

wherein it is determined that the input voice is the voice of the
registered person specified by the identity claim, in the case where the
calculated verification distance is included in a region defined by the
input false acceptance error rate in the probability distribution of the
interspeaker distances.